Transcriptomics: Lecture 2

Biotech 7005/Bioinf 3000
Frontiers of Biotechnology: Bioinformatics and Systems Modelling
The University of Adelaide

Author
Affiliation

Dr Stevie Pederson (They/Them)
stevie.pederson@thekids.org.au

Black Ochre Data Labs
The Kids Research Institute Australia

Published

September 15, 2025

Welcome To Country

I’d like to acknowledge the Kaurna people as the traditional owners and custodians of the land we know today as the Adelaide Plains, where I live & work.

I also acknowledge the deep feelings of attachment and relationship of the Kaurna people to their place.

I pay my respects to the cultural authority of Aboriginal and Torres Strait Islander peoples from other areas of Australia and to Elders past, present and emerging, and I acknowledge any Aboriginal Australians who may be with us today.

RNA-Seq

RNA Sequencing

According to Wang, Gerstein, and Snyder (2009)

RNA-Seq, also called RNA sequencing, is a particular technology-based sequencing technique which uses next-generation sequencing (NGS) to reveal the presence and quantity of RNA in a biological sample at a given moment, analyzing the continuously changing cellular transcriptome.

RNA Sequencing

  • Microarray studies are still published regularly
    • Also used extensively for methylation
  • RNA sequencing is now the dominant technology
  • Strong improvement for:
    • transcript-level resolution
    • un-annotated genes
    • allelic bias
    • genomic variants

RNA Sequencing

  • Microarrays rely on probes for transcripts defined at design time
  • Restricted in the number of transcripts/genes targeted
    • Gencode 48 (GRCh38): 78,686 genes + 385,669 transcripts
  • Probes capture non-specific binding
    • Affymetrix use 25-mer probes
    • Illumina arrays use 50-mer probes \(\implies\) sequences better targeted

These limitations do not exist for RNA-Seq

RNA Sequencing

  • Directly sequence the biological material
  • Map to most recent reference at any point in time
  • Assemble a transcriptome (tissue specific)
  • Detect InDels / SNPs in expressed sequences
  • Multiple variations
    • totalRNA or polyA transcripts \(\implies\) most similar to microarrays
    • smallRNA libraries
    • Long Reads (Oxford Nanopore, PacBio)
      \(\implies\) originally isoform discovery, quantitative methods improving

The RNA Population Of a Eukaryotic Cell

Image taken from Chan and Tay (2018)

The Key Steps

  • Focus from here on will be sequencing mRNA using short reads
  1. Library Preparation
    • RNA Quality assessment
    • Selecting target molecules
    • Adding sequencing primers
  2. Sequencing
  3. Alignment + Quantitation
  4. DE Gene Detection
  5. Downstream Analysis
  6. (Optional) Nobel Prize
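
As a rough illustration of what the quantitation part of step 3 produces, the sketch below counts aligned reads per gene. The gene coordinates, the reads and the function name `count_reads_per_gene` are all hypothetical toy values chosen for this example, and the rule of discarding reads that overlap zero or several genes is a simplifying assumption rather than the behaviour of any particular tool.

```python
# Hypothetical annotation: gene -> (chromosome, start, end), 1-based inclusive.
GENES = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
}

# Hypothetical aligned reads: (chromosome, start, end).
ALIGNED_READS = [
    ("chr1", 120, 270),    # inside geneA
    ("chr1", 450, 600),    # partially overlaps geneA
    ("chr1", 900, 1050),   # inside geneB
    ("chr1", 1500, 1650),  # overlaps no annotated gene
]


def count_reads_per_gene(reads, genes):
    """Count reads overlapping exactly one gene (assumed toy rule)."""
    counts = {gene: 0 for gene in genes}
    for chrom, start, end in reads:
        hits = [
            gene
            for gene, (g_chrom, g_start, g_end) in genes.items()
            if chrom == g_chrom and start <= g_end and end >= g_start
        ]
        # Reads hitting no gene, or more than one gene, are not counted.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


counts = count_reads_per_gene(ALIGNED_READS, GENES)
print(counts)  # {'geneA': 2, 'geneB': 1}
```

A table of counts like this, with one column per sample, is the usual starting point for DE gene detection in step 4.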

RNA Selection

  • rRNA makes up about 80% of cellular RNA \(\rightarrow\) not of general interest
    • tRNA ~15% + mRNA ~5% of cellular RNA [1]
  1. Select for poly-adenylated RNA using oligo-dT-based methods
    • Only extracts intact mRNA with a polyA tail (includes some ncRNA)

Image from https://www.lexogen.com/polya-rna-selection-kit/
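
To see why selection matters, the approximate proportions above can be turned into a read budget. This is a minimal sketch that assumes reads are sampled in proportion to each RNA class's share of the sample, which ignores fragment length and library preparation effects. The 30 million read depth, the 99% rRNA removal figure and both function names are hypothetical values chosen for illustration.

```python
# Approximate composition of cellular RNA, as quoted on the slide.
COMPOSITION = {"rRNA": 0.80, "tRNA": 0.15, "mRNA": 0.05}


def expected_reads(total_reads, composition):
    """Split a read budget across RNA classes in proportion to abundance."""
    return {rna: round(total_reads * frac) for rna, frac in composition.items()}


def mrna_fraction_after_depletion(composition, rrna_removed):
    """mRNA share of the remaining RNA once a fraction of rRNA is removed."""
    remaining = dict(composition)
    remaining["rRNA"] *= 1 - rrna_removed
    return remaining["mRNA"] / sum(remaining.values())


# Hypothetical library of 30 million reads with no selection at all.
reads = expected_reads(30_000_000, COMPOSITION)
print(reads["mRNA"])  # 1500000

# Hypothetical depletion step removing 99% of rRNA.
print(round(mrna_fraction_after_depletion(COMPOSITION, 0.99), 2))  # 0.24
```

Under these assumptions only about 1 read in 20 would come from mRNA without any selection, which is why some form of enrichment or depletion is applied before sequencing.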

RNA Selection

  2. Enzymatically deplete rRNA sequences
    • rRNA targeted using probes \(\implies\) dsRNA degraded
    • Can additionally target haemoglobin (globin) mRNA in whole blood

Image from https://support.illumina.com.cn

Library Preparation

  • RNA is then fragmented and size selected (200-300nt)
    • Very short transcripts always lost during this step
  • cDNA produced
  • Sequencing adapters added
    • Many adapters now contain Unique Molecular Identifiers (UMI)
    • Helps identify PCR duplicates
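
The way UMIs help identify PCR duplicates can be sketched as follows: reads that align to the same position and strand and also carry the same UMI are assumed to be copies of one original molecule, so only one is kept. The reads, the UMI sequences and the function name `deduplicate` are hypothetical. The sketch requires exact UMI matches, whereas dedicated tools such as UMI-tools also allow for sequencing errors within the UMI.

```python
from collections import defaultdict

# Hypothetical aligned reads: (read_id, chromosome, start, strand, umi).
READS = [
    ("r1", "chr1", 1000, "+", "ACGTAC"),
    ("r2", "chr1", 1000, "+", "ACGTAC"),  # same position and UMI as r1
    ("r3", "chr1", 1000, "+", "TTGACA"),  # same position, different UMI
    ("r4", "chr1", 2500, "-", "ACGTAC"),  # same UMI, different position
]


def deduplicate(reads):
    """Keep the first read seen for each (position, strand, UMI) group."""
    groups = defaultdict(list)
    for read_id, chrom, start, strand, umi in reads:
        groups[(chrom, start, strand, umi)].append(read_id)
    # Dicts preserve insertion order, so groups come out in first-seen order.
    return [read_ids[0] for read_ids in groups.values()]


kept = deduplicate(READS)
print(kept)  # ['r1', 'r3', 'r4']
```

Without the UMI, r1, r2 and r3 would all look like duplicates of each other. With it, r3 is recognised as a distinct molecule that happens to share a start position.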

References

Chan, Jia Jia, and Yvonne Tay. 2018. “Noncoding RNA:RNA Regulatory Networks in Cancer.” International Journal of Molecular Sciences 19 (5). https://doi.org/10.3390/ijms19051310.
Wang, Zhong, Mark Gerstein, and Michael Snyder. 2009. “RNA-Seq: A Revolutionary Tool for Transcriptomics.” Nature Reviews Genetics 10 (1): 57–63.

Footnotes

  1. https://bionumbers.hms.harvard.edu/bionumber.aspx?s=n&v=5&id=100264